A Unified Framework for Multi-distribution Density Ratio Estimation
Lantao Yu, Yujia Jin, Stefano Ermon
Generalizing density ratio estimation (DRE) from two distributions to multiple distributions leads to important new applications, such as estimating statistical discrepancy among multiple random variables (for example, multi-distribution f-divergence) and bias correction via multiple importance sampling. We develop a general framework from the perspective of Bregman divergence minimization, where each strictly convex multivariate function induces a proper loss for multi-distribution DRE. We show that our framework leads to methods that strictly generalize their counterparts in binary DRE, as well as new methods that show comparable or superior performance on various downstream tasks.

Binary DRE is a powerful paradigm because computing a density ratio focuses on extracting and preserving contrastive information between two distributions, which is crucial in many tasks. Despite the tremendous success of binary DRE, many applications involve more than two probability distributions. Developing density ratio estimation methods among multiple distributions has the potential to advance applications such as estimating multi-distribution statistical discrepancy measures (Garcia-Garcia & Williamson, 2012), multi-domain transfer learning, bias correction and variance reduction with multiple importance sampling (Elvira et al., 2019), multi-marginal generative modeling (Cao et al., 2019), and multilingual machine translation (Dong et al., 2015; Aharoni et al., 2019).
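To make the notion of density ratios among several distributions concrete, the following is a minimal sketch of one familiar estimator: fit a multi-class probabilistic classifier to samples labeled by their source distribution, then read off log p_i(x)/p_j(x) as a difference of class logits, corrected for class sizes. This is an illustration only and not the framework developed in the paper. The three unit-variance Gaussians, the helper names `fit_softmax` and `log_ratio`, and the optimizer settings are all assumptions chosen so that the true log-ratios are linear in x and can be checked in closed form.

```python
import numpy as np

def fit_softmax(x, y, k, steps=2000, lr=0.5):
    """Fit multinomial logistic regression on features [x, 1] by gradient descent."""
    feats = np.stack([x, np.ones_like(x)], axis=1)    # shape (n, 2)
    onehot = np.eye(k)[y]                              # shape (n, k)
    w = np.zeros((2, k))
    for _ in range(steps):
        logits = feats @ w
        logits -= logits.max(axis=1, keepdims=True)    # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy loss with respect to w.
        w -= lr * feats.T @ (probs - onehot) / len(x)
    return w

def log_ratio(w, counts, x, i, j):
    """Estimate log p_i(x) / p_j(x) from logits, removing the class-prior term."""
    feats = np.stack([x, np.ones_like(x)], axis=1)
    logits = feats @ w
    return (logits[:, i] - logits[:, j]) - (np.log(counts[i]) - np.log(counts[j]))

# Assumed toy setup: three unit-variance Gaussians with different means.
rng = np.random.default_rng(0)
means = np.array([-1.0, 0.0, 1.0])
n = 5000
x = np.concatenate([rng.normal(m, 1.0, n) for m in means])
y = np.repeat(np.arange(3), n)

w = fit_softmax(x, y, k=3)
counts = np.bincount(y, minlength=3)

grid = np.array([-1.0, 0.0, 0.5, 1.0])
est = log_ratio(w, counts, grid, 2, 0)
# Closed form for equal-variance Gaussians: (mu_i - mu_j) x - (mu_i^2 - mu_j^2) / 2.
true = (means[2] - means[0]) * grid - (means[2] ** 2 - means[0] ** 2) / 2
print(np.round(est, 2), true)
```

Because a single classifier is shared across all classes, the estimated ratios are mutually consistent by construction: the estimate of log p_2/p_0 equals the sum of the estimates of log p_2/p_1 and log p_1/p_0. Fitting separate binary estimators for each pair would not guarantee this.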